# Define base_dir for consistent path management
from pathlib import Path
import os
notebook_dir = Path(os.getcwd()).resolve()
base_dir = notebook_dir.parent
print(base_dir)
/home/cgoehler/team-extra/ndvi-time-series-prediction
This notebook outlines the complete workflow for processing NDVI (Normalized Difference Vegetation Index) data from the deep-extremes-minicubes dataset provided by the Remote Sensing Centre for Earth System Research. The steps include data preprocessing, dataset initialization, data processing, and sanity checks. Additionally, we perform a train/test split based on the analysis of missing values and implement strategies to handle these missing values, preparing the data for training.
The DeepExtremes project provides a dataset of minicubes, which are small, manageable data cubes that contain various types of remote sensing data. Each minicube is a 3D array with dimensions (495 x 128 x 128) representing time, latitude, and longitude, and they are designed to facilitate the study of extreme events and their impacts on the Earth system.
Each minicube is rich with metadata, including details on the geographic location, creation date, data processing steps, and variable-specific attributes. This metadata ensures the data’s integrity, traceability, and usability for scientific analysis.
First, we import the necessary packages, read the deep-extremes-minicubes registry from the S3 bucket, remove invalid cubes, and split the dataset.
# Import necessary packages and custom functions
import sys
sys.path.insert(0, os.path.join(base_dir, "src", "data_processing"))
import s3fs
import itertools
import zarr
import math
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
import pandas as pd
from import_cubes import *
from helper import *
import torch
from my_loader import DeepCubeTSDatasetBasti
from process_ndvi import *
from sanity_checks import *
from stl_interpolation import *
from statsmodels.tsa.seasonal import STL
# AWS Credentials
AWS_ACCESS_KEY_ID = "***"
AWS_SECRET_ACCESS_KEY = "***"
AWS_DEFAULT_REGION = "eu-central-1"
# Initialize S3FileSystem
minicubefs = s3fs.S3FileSystem(key=AWS_ACCESS_KEY_ID, secret=AWS_SECRET_ACCESS_KEY)
# Read registry from S3FileSystem
bucket_path = "s3://deepextremes-minicubes/1.2.2"
registry_df = read_registry(bucket_path, minicubefs)
registry_df.shape
(5593, 1)
# Remove bad cubes
print("Initial number of Cubes: ", registry_df.shape[0])
registry_df_filtered = remove_cubes(registry_df)
print("Number of Cubes after removal: ", registry_df_filtered.shape[0])
Initial number of Cubes: 5593
Number of Cubes after removal: 5397
To ensure a balanced data distribution across the globe while reducing the total number of cubes, we use the split provided in the split_table.csv file.
From here on, we only use the cubes in the training set. Because this is a time-series prediction task, the train/validation/test split is applied within each cube, by splitting the time series of each cube individually.
# Split cubes into training, validation and test sets
split_table_path = base_dir / "csvs" / "split_table.csv"
traincubes, valcubes, testcubes = split_datasets(
    cube_registry=registry_df_filtered, split_table_path=split_table_path
)
/home/bastiloeblein/team-extra/ndvi-time-series-prediction/Notebooks/../src/data_processing/import_cubes.py:88: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
cube_registry["mc_id"] = cube_registry["mc_id"].apply(preprocess_mc_id)
print(f"Number of entries in the training dataset: {len(traincubes)}")
print(f"Number of entries in the validation dataset: {len(valcubes)}")
print(f"Number of entries in the test dataset: {len(testcubes)}")
Number of entries in the training dataset: 3052
Number of entries in the validation dataset: 666
Number of entries in the test dataset: 97
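Since the train/validation/test split is applied within each cube along the time axis, a chronological cut can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_time_series` and the 70/15/15 fractions are hypothetical and are not taken from the project code.

```python
import numpy as np

def split_time_series(cube, train_frac=0.7, val_frac=0.15):
    """Split a (time, x, y) array chronologically.

    The fractions are illustrative assumptions, not the project's values.
    """
    n_time = cube.shape[0]
    train_end = int(n_time * train_frac)
    val_end = train_end + int(n_time * val_frac)
    # Earlier time steps go to training, later ones to validation and test
    return cube[:train_end], cube[train_end:val_end], cube[val_end:]

# Small synthetic cube with the same number of time steps as a minicube
cube = np.random.rand(495, 8, 8)
train, val, test = split_time_series(cube)
print(train.shape[0], val.shape[0], test.shape[0])
```

Cutting along time without shuffling keeps every validation and test step strictly later than the training steps, which avoids leaking future observations into training.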
# Define the S3 bucket name
s3_bucket = "deepextremes-minicubes/1.2.2"
# Get quantile data
quantile_data = get_var_quantiles()
# dict(itertools.islice(quantile_data.items(), 1))
We initialize the dataset containing minicubes (each with dimension 495 (time periods) x 128 x 128 (pixels)) with additional information on pixel-wise NDVI and cube classes (e.g. soil, meadow).
Because of data quality and memory constraints, we decided to use only 100 cubes. We therefore identified the 100 cubes with the most time periods in which more than 80% of the pixels are not NaN.
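The selection criterion can be sketched roughly as below. This is not the code that produced Final_Cubes.csv; the helper names `count_good_time_steps` and `rank_cubes` and the toy cubes are assumptions made for illustration.

```python
import numpy as np

def count_good_time_steps(ndvi, min_valid_frac=0.8):
    """Count time steps in which more than min_valid_frac of the pixels are not NaN."""
    valid_frac = np.mean(~np.isnan(ndvi), axis=(1, 2))  # one fraction per time step
    return int(np.sum(valid_frac > min_valid_frac))

def rank_cubes(cubes, top_n):
    """Return the ids of the top_n cubes with the most good time steps."""
    scores = {cube_id: count_good_time_steps(ndvi) for cube_id, ndvi in cubes.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy cubes of shape (time=3, x=2, y=5), i.e. 10 pixels per time step
a = np.ones((3, 2, 5))
a[0] = np.nan            # first time step entirely missing
b = np.ones((3, 2, 5))
b[0, 0, 0] = np.nan      # 90% of the pixels valid in the first time step
c = np.full((3, 2, 5), np.nan)

best = rank_cubes({"a": a, "b": b, "c": c}, top_n=2)
print(best)
```

In the notebook, `top_n` would correspond to 100 and the threshold to the 80% mentioned in the text.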
valid_cubes_path = base_dir / "csvs" / "Final_Cubes.csv"
valid_df = pd.read_csv(valid_cubes_path, sep=";")
valid_indices = valid_df["Cube_ID"].to_list()
# Initialize dataset
dataset = DeepCubeTSDatasetBasti(
    minicubefs, bucket_path, traincubes, quantile_data, valid_indices
)
dataset.__repr__
<bound method DeepCubeTSDatasetBasti.__repr__ of <(128 * 128 data points per cube): 16.384 * 100 (number of cubes) = 1638400 (Total number of data points)>>
The data is processed in chunks to manage memory usage.
NDVI values are preprocessed to mask pixels where NDVI prediction is not meaningful (e.g. pixels that do not contain any vegetation). We therefore mask pixels whose average NDVI over the entire time period is below 0.2 (https://earthobservatory.nasa.gov/features/MeasuringVegetation).
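A minimal sketch of this masking step in NumPy is shown below. It is an assumption about what such a helper might look like, not the implementation in process_ndvi: the name `mask_low_ndvi_pixels` is hypothetical, and masked pixels are set to NaN here rather than to a fill value.

```python
import numpy as np

NDVI_THRESHOLD = 0.2  # threshold stated in the text

def mask_low_ndvi_pixels(ndvi, threshold=NDVI_THRESHOLD):
    """Set every pixel whose temporal mean NDVI is below threshold to NaN."""
    mean_ndvi = np.nanmean(ndvi, axis=0)  # per-pixel mean over time, shape (x, y)
    low = mean_ndvi < threshold           # boolean mask of non-vegetated pixels
    masked = ndvi.copy()
    masked[:, low] = np.nan               # mask the whole time series of those pixels
    return masked

# Toy cube with shape (time=2, x=1, y=2)
ndvi = np.array([[[0.1, 0.6]],
                 [[0.2, 0.8]]])
masked = mask_low_ndvi_pixels(ndvi)
print(masked)
```

The first pixel has a mean of about 0.15 and is masked at every time step, while the second pixel (mean 0.7) is left unchanged.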
# Process the cubes for the valid indices
for i in valid_indices:
    print(i)
    ndvi_tensor, cloudmask_tensor, lat, lon, cube_class = dataset[i]
    # Calculate the average NDVI for each pixel
    average_ndvi = calculate_average_ndvi_for_each_pixel(ndvi_tensor)
    # Mask low NDVI values
    ndvi_masked = mask_low_ndvi_values(average_ndvi, ndvi_tensor)
    # Convert the PyTorch tensors to NumPy arrays
    ndvi_data = ndvi_masked.numpy()
    cloud_mask_data = cloudmask_tensor.numpy()
    # Replace all -9999.0 values with NaN in the NDVI data
    ndvi_data = np.where(ndvi_data == -9999.0, np.nan, ndvi_data)
    # Generate the dates
    dates = pd.date_range(start='2016-01-01', periods=495, freq='5D')
    # Create xarray DataArrays
    ndvi = xr.DataArray(ndvi_data, coords=[dates, range(128), range(128)], dims=['time', 'x', 'y'], name='NDVI')
    cloud_mask = xr.DataArray(cloud_mask_data, coords=[dates, range(128), range(128)], dims=['time', 'x', 'y'], name='Cloud_Mask')
    # Combine into a Dataset
    data = xr.Dataset({
        'NDVI': ndvi,
        'Cloud_Mask': cloud_mask
    })
    # Add lat, lon and cube class as attributes
    data.attrs['lat'] = lat
    data.attrs['lon'] = lon
    data.attrs['class'] = cube_class
    # Add metadata to the variables and dataset
    data['NDVI'].attrs['description'] = 'Normalized Difference Vegetation Index'
    data['Cloud_Mask'].attrs['description'] = 'Cloud Mask (0=clear, 1=cloudy)'
    data.attrs['source'] = f'Cube_{i}'
    # Save to NetCDF
    save_dir = base_dir / "data" / "data_final_updated"
    data.to_netcdf(save_dir / f'Cube_{i}.nc')
data
Finally, we conduct some sanity checks:
Cube Integrity: Ensure that the shape and dimensions of each cube are consistent and correct. Verify that the number of time steps, rows, and columns match the expected dimensions.
Distribution of NDVI Values: Analyze the distribution of NDVI values to ensure they fall within the expected range [0, 1]. Check for any unexpected outliers or anomalies in the data.
No Pixels with Average NDVI Below 0.2: Ensure that there are no pixels with an average NDVI value below 0.2 over the entire time period.
Comparison with Cube Class: Compare the cube class with the overall average NDVI (considering all pixels and time periods). Additionally, compare the number of masked values with the cube class to check for consistency.
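The first three checks can be sketched as plain assertions, as below. This is an illustrative stand-in and not the code in the sanity_checks module: the function name `run_sanity_checks` and its return values are assumptions.

```python
import numpy as np

def run_sanity_checks(ndvi, expected_shape=(495, 128, 128), threshold=0.2):
    """Raise AssertionError if a cube violates one of the checks; return summary metrics."""
    # Check 1: cube integrity
    assert ndvi.shape == expected_shape, f"unexpected shape {ndvi.shape}"
    # Check 2: all non-NaN values lie in the expected range [0, 1]
    valid = ndvi[~np.isnan(ndvi)]
    assert valid.size > 0, "cube contains only NaN values"
    assert valid.min() >= 0.0 and valid.max() <= 1.0, "NDVI outside [0, 1]"
    # Check 3: per-pixel temporal mean, skipping pixels that are NaN at every time step
    has_data = ~np.isnan(ndvi).all(axis=0)
    pixel_sum = np.nansum(ndvi, axis=0)[has_data]
    pixel_count = np.sum(~np.isnan(ndvi), axis=0)[has_data]
    pixel_mean = pixel_sum / pixel_count
    assert (pixel_mean >= threshold).all(), "pixel with mean NDVI below threshold"
    # Overall average NDVI and number of masked (NaN) values
    return float(valid.mean()), int(np.isnan(ndvi).sum())

# Toy cube: one fully masked pixel, all other values 0.5
cube = np.full((4, 2, 2), 0.5)
cube[:, 0, 0] = np.nan
overall_avg, num_masked = run_sanity_checks(cube, expected_shape=(4, 2, 2))
print(overall_avg, num_masked)
```

Raising `AssertionError` for failed checks lets a calling loop separate failed checks from unexpected errors such as unreadable files.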
# Lists to store errors
error_list = []
bad_cubes = []
# Path to the directory with NDVI chunks
data_dir = base_dir / "data" / "data_final_updated"
data_list = os.listdir(data_dir)
print(len(data_list))
# Loop over all NetCDF files
for nc_file in data_list:
    if nc_file.endswith('.nc'):
        nc_path = os.path.join(data_dir, nc_file)
        print(f"Processing {nc_path}...")
        # Load the current NetCDF file
        data = xr.open_dataset(nc_path)
        ndvi_data = data['NDVI'].values
        cube_class = data.attrs.get('class')
        # Expected shape of the NDVI cubes
        expected_shape = (495, 128, 128)
        try:
            # Step 1: Check cube integrity
            check_cube_integrity(ndvi_data, expected_shape)
            # Step 2: Check the distribution of NDVI values
            check_ndvi_distribution(ndvi_data)
            # Step 3: Ensure no pixels have an average NDVI below 0.2 and calculate additional metrics
            overall_avg_ndvi, num_masked_values = check_if_contains_low_values(ndvi_data)
            print(f"Cube {nc_file} - Class: {cube_class}, Overall Average NDVI: {overall_avg_ndvi}, Number of Masked Values: {num_masked_values}")
            print(f"Cube {nc_file} passed all sanity checks.\n")
        except AssertionError as e:
            # A sanity check failed
            error_message = f"Cube {nc_file} failed sanity check: {str(e)}"
            print(error_message)
            error_list.append(error_message)
        except Exception as e:
            # Any other error (e.g. unreadable data) marks the cube as bad
            error_message = f"Cube {nc_file} encountered an error: {str(e)}"
            print(error_message)
            bad_cubes.append(nc_path)
# Display all errors
if error_list:
    print("\nSummary of errors:")
    for error in error_list:
        print(error)
else:
    print("\nAll cubes passed sanity checks.")
# Display list of all bad cubes
if bad_cubes:
    print("\nList of bad cubes:")
    for bad_cube in bad_cubes:
        print(bad_cube)
In this section, we will perform the train and test split for our time series data. Here’s a structured plan for what we will accomplish:
By carefully selecting a starting point and splitting the data, we aim to enhance the accuracy and reliability of our time series predictions and interpolation of missing values.
file_path = base_dir / "data" / "data_final_updated"
nc_files = [file for file in os.listdir(file_path) if file.endswith('.nc')]
print(f"Number of files: {len(nc_files)}")
Number of files: 100
Analyzing our dataset, we observed that in 2016 only sporadic pixels across all cubes contain NDVI values; the majority are NaN. To ensure the quality and completeness of our data for time series prediction, we exclude the year 2016 from our dataset.
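Excluding 2016 amounts to a boolean selection along the time axis. The sketch below uses a small synthetic array in place of a real cube and rebuilds the 5-day date grid used earlier in the notebook (495 steps starting on 2016-01-01). The variable names are illustrative, and the project may perform this step differently.

```python
import numpy as np

# Same 5-day grid as in the notebook: 495 steps starting on 2016-01-01
dates = np.datetime64("2016-01-01") + np.arange(495) * np.timedelta64(5, "D")
ndvi = np.random.rand(495, 4, 4)  # small synthetic stand-in for one cube

# Keep only the time steps from 2017 onwards
keep = dates >= np.datetime64("2017-01-01")
dates_kept = dates[keep]
ndvi_kept = ndvi[keep]

print(dates_kept[0], ndvi_kept.shape)
```

On this grid, 74 time steps fall in 2016, so 421 steps remain and the first retained date is 2017-01-05.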
# Initialize a dictionary to store the results
results = {}
# Iterate over each .nc file and perform the calculations
for nc_file in nc_files:
    full_path = os.path.join(file_path, nc_file)
    ds = xr.open_dataset(full_path)
    ndvi_data = ds['NDVI'].values  # Extract NDVI data
    time_data = ds['time'].values  # Extract time data
    for i, time_step in enumerate(time_data):
        # Count the number of non-NaN values for the current time step
        non_nan_count = np.sum(~np.isnan(ndvi_data[i, ...]))
        # Convert timestamp to readable date format if necessary
        if isinstance(time_step, np.datetime64):
            time_step = str(time_step)
        # Store the results
        if time_step in results:
            results[time_step] += non_nan_count
        else:
            results[time_step] = non_nan_count
# Output the results
for time_step, non_nan_count in results.items():
    print(f"Date: {time_step} - Number of non-NaN NDVI values: {non_nan_count}")
Date: 2016-01-01T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-06T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-11T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-16T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-21T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-26T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-01-31T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-02-05T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-02-10T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-02-15T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-02-20T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-02-25T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-01T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-06T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-11T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-16T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-21T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-26T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-03-31T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-05T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-10T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-15T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-20T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-25T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-04-30T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-05T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-10T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-15T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-20T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-25T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-05-30T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-04T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-09T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-14T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-19T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-24T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-06-29T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-04T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-09T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-14T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-19T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-24T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-07-29T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-03T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-08T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-13T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-18T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-23T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-08-28T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-02T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-07T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-12T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-17T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-22T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-09-27T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-02T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-07T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-12T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-17T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-22T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-10-27T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-11-01T00:00:00.000000000 - Number of non-NaN NDVI values: 735
Date: 2016-11-06T00:00:00.000000000 - Number of non-NaN NDVI values: 15236
Date: 2016-11-11T00:00:00.000000000 - Number of non-NaN NDVI values: 16074
Date: 2016-11-16T00:00:00.000000000 - Number of non-NaN NDVI values: 29670
Date: 2016-11-21T00:00:00.000000000 - Number of non-NaN NDVI values: 16149
Date: 2016-11-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1212
Date: 2016-12-01T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-12-06T00:00:00.000000000 - Number of non-NaN NDVI values: 29670
Date: 2016-12-11T00:00:00.000000000 - Number of non-NaN NDVI values: 63
Date: 2016-12-16T00:00:00.000000000 - Number of non-NaN NDVI values: 455
Date: 2016-12-21T00:00:00.000000000 - Number of non-NaN NDVI values: 0
Date: 2016-12-26T00:00:00.000000000 - Number of non-NaN NDVI values: 29644
Date: 2016-12-31T00:00:00.000000000 - Number of non-NaN NDVI values: 154887
Date: 2017-01-05T00:00:00.000000000 - Number of non-NaN NDVI values: 258414
Date: 2017-01-10T00:00:00.000000000 - Number of non-NaN NDVI values: 362279
Date: 2017-01-15T00:00:00.000000000 - Number of non-NaN NDVI values: 813709
Date: 2017-01-20T00:00:00.000000000 - Number of non-NaN NDVI values: 174809
Date: 2017-01-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1231339
Date: 2017-01-30T00:00:00.000000000 - Number of non-NaN NDVI values: 638865
Date: 2017-02-04T00:00:00.000000000 - Number of non-NaN NDVI values: 599125
Date: 2017-02-09T00:00:00.000000000 - Number of non-NaN NDVI values: 441039
Date: 2017-02-14T00:00:00.000000000 - Number of non-NaN NDVI values: 650624
Date: 2017-02-19T00:00:00.000000000 - Number of non-NaN NDVI values: 485039
Date: 2017-02-24T00:00:00.000000000 - Number of non-NaN NDVI values: 553548
Date: 2017-03-01T00:00:00.000000000 - Number of non-NaN NDVI values: 290621
Date: 2017-03-06T00:00:00.000000000 - Number of non-NaN NDVI values: 707662
Date: 2017-03-11T00:00:00.000000000 - Number of non-NaN NDVI values: 617145
Date: 2017-03-16T00:00:00.000000000 - Number of non-NaN NDVI values: 601657
Date: 2017-03-21T00:00:00.000000000 - Number of non-NaN NDVI values: 297106
Date: 2017-03-26T00:00:00.000000000 - Number of non-NaN NDVI values: 512963
Date: 2017-03-31T00:00:00.000000000 - Number of non-NaN NDVI values: 460329
Date: 2017-04-05T00:00:00.000000000 - Number of non-NaN NDVI values: 641623
Date: 2017-04-10T00:00:00.000000000 - Number of non-NaN NDVI values: 186476
Date: 2017-04-15T00:00:00.000000000 - Number of non-NaN NDVI values: 864421
Date: 2017-04-20T00:00:00.000000000 - Number of non-NaN NDVI values: 539825
Date: 2017-04-25T00:00:00.000000000 - Number of non-NaN NDVI values: 332112
Date: 2017-04-30T00:00:00.000000000 - Number of non-NaN NDVI values: 413190
Date: 2017-05-05T00:00:00.000000000 - Number of non-NaN NDVI values: 643990
Date: 2017-05-10T00:00:00.000000000 - Number of non-NaN NDVI values: 527493
Date: 2017-05-15T00:00:00.000000000 - Number of non-NaN NDVI values: 668972
Date: 2017-05-20T00:00:00.000000000 - Number of non-NaN NDVI values: 418498
Date: 2017-05-25T00:00:00.000000000 - Number of non-NaN NDVI values: 668569
Date: 2017-05-30T00:00:00.000000000 - Number of non-NaN NDVI values: 440164
Date: 2017-06-04T00:00:00.000000000 - Number of non-NaN NDVI values: 838649
Date: 2017-06-09T00:00:00.000000000 - Number of non-NaN NDVI values: 400726
Date: 2017-06-14T00:00:00.000000000 - Number of non-NaN NDVI values: 802162
Date: 2017-06-19T00:00:00.000000000 - Number of non-NaN NDVI values: 576634
Date: 2017-06-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1064392
Date: 2017-06-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1396831
Date: 2017-07-04T00:00:00.000000000 - Number of non-NaN NDVI values: 904167
Date: 2017-07-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1014362
Date: 2017-07-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1235566
Date: 2017-07-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1039258
Date: 2017-07-24T00:00:00.000000000 - Number of non-NaN NDVI values: 801631
Date: 2017-07-29T00:00:00.000000000 - Number of non-NaN NDVI values: 964584
Date: 2017-08-03T00:00:00.000000000 - Number of non-NaN NDVI values: 711300
Date: 2017-08-08T00:00:00.000000000 - Number of non-NaN NDVI values: 794347
Date: 2017-08-13T00:00:00.000000000 - Number of non-NaN NDVI values: 980194
Date: 2017-08-18T00:00:00.000000000 - Number of non-NaN NDVI values: 921815
Date: 2017-08-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1200440
Date: 2017-08-28T00:00:00.000000000 - Number of non-NaN NDVI values: 625530
Date: 2017-09-02T00:00:00.000000000 - Number of non-NaN NDVI values: 874742
Date: 2017-09-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1180835
Date: 2017-09-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1023301
Date: 2017-09-17T00:00:00.000000000 - Number of non-NaN NDVI values: 265021
Date: 2017-09-22T00:00:00.000000000 - Number of non-NaN NDVI values: 560183
Date: 2017-09-27T00:00:00.000000000 - Number of non-NaN NDVI values: 971147
Date: 2017-10-02T00:00:00.000000000 - Number of non-NaN NDVI values: 995119
Date: 2017-10-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1150902
Date: 2017-10-12T00:00:00.000000000 - Number of non-NaN NDVI values: 983103
Date: 2017-10-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1193391
Date: 2017-10-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1383536
Date: 2017-10-27T00:00:00.000000000 - Number of non-NaN NDVI values: 969612
Date: 2017-11-01T00:00:00.000000000 - Number of non-NaN NDVI values: 869122
Date: 2017-11-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1040188
Date: 2017-11-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1093029
Date: 2017-11-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1307050
Date: 2017-11-21T00:00:00.000000000 - Number of non-NaN NDVI values: 842467
Date: 2017-11-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1031091
Date: 2017-12-01T00:00:00.000000000 - Number of non-NaN NDVI values: 700918
Date: 2017-12-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1223201
Date: 2017-12-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1007335
Date: 2017-12-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1116969
Date: 2017-12-21T00:00:00.000000000 - Number of non-NaN NDVI values: 895879
Date: 2017-12-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1105935
Date: 2017-12-31T00:00:00.000000000 - Number of non-NaN NDVI values: 640445
Date: 2018-01-05T00:00:00.000000000 - Number of non-NaN NDVI values: 895603
Date: 2018-01-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1127206
Date: 2018-01-15T00:00:00.000000000 - Number of non-NaN NDVI values: 838006
Date: 2018-01-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1196178
Date: 2018-01-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1099987
Date: 2018-01-30T00:00:00.000000000 - Number of non-NaN NDVI values: 855080
Date: 2018-02-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1167105
Date: 2018-02-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1214804
Date: 2018-02-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1003651
Date: 2018-02-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1036841
Date: 2018-02-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1088853
Date: 2018-03-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1247952
Date: 2018-03-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1067825
Date: 2018-03-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1000610
Date: 2018-03-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1011396
Date: 2018-03-21T00:00:00.000000000 - Number of non-NaN NDVI values: 753163
Date: 2018-03-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1265163
Date: 2018-03-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1265194
Date: 2018-04-05T00:00:00.000000000 - Number of non-NaN NDVI values: 726434
Date: 2018-04-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1166807
Date: 2018-04-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1033506
Date: 2018-04-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1250892
Date: 2018-04-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1450697
Date: 2018-04-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1147316
Date: 2018-05-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1359881
Date: 2018-05-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1450309
Date: 2018-05-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1416305
Date: 2018-05-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1326400
Date: 2018-05-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1309329
Date: 2018-05-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1450126
Date: 2018-06-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1436022
Date: 2018-06-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1461019
Date: 2018-06-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1317819
Date: 2018-06-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1449900
Date: 2018-06-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1460091
Date: 2018-06-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1351880
Date: 2018-07-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1167428
Date: 2018-07-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1209222
Date: 2018-07-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1065911
Date: 2018-07-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1393458
Date: 2018-07-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1293343
Date: 2018-07-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1304142
Date: 2018-08-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1403274
Date: 2018-08-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1253138
Date: 2018-08-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1267069
Date: 2018-08-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1309457
Date: 2018-08-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1184789
Date: 2018-08-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1175814
Date: 2018-09-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1124725
Date: 2018-09-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1296429
Date: 2018-09-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1467258
Date: 2018-09-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1264949
Date: 2018-09-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1331015
Date: 2018-09-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1311157
Date: 2018-10-02T00:00:00.000000000 - Number of non-NaN NDVI values: 877577
Date: 2018-10-07T00:00:00.000000000 - Number of non-NaN NDVI values: 842026
Date: 2018-10-12T00:00:00.000000000 - Number of non-NaN NDVI values: 946839
Date: 2018-10-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1086916
Date: 2018-10-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1290187
Date: 2018-10-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1201566
Date: 2018-11-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1211751
Date: 2018-11-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1391392
Date: 2018-11-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1180698
Date: 2018-11-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1206150
Date: 2018-11-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1219878
Date: 2018-11-26T00:00:00.000000000 - Number of non-NaN NDVI values: 997135
Date: 2018-12-01T00:00:00.000000000 - Number of non-NaN NDVI values: 913512
Date: 2018-12-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1109780
Date: 2018-12-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1284869
Date: 2018-12-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1276355
Date: 2018-12-21T00:00:00.000000000 - Number of non-NaN NDVI values: 842122
Date: 2018-12-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1041945
Date: 2018-12-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1402546
Date: 2019-01-05T00:00:00.000000000 - Number of non-NaN NDVI values: 766200
Date: 2019-01-10T00:00:00.000000000 - Number of non-NaN NDVI values: 572778
Date: 2019-01-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1007771
Date: 2019-01-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1217585
Date: 2019-01-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1107193
Date: 2019-01-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1194286
Date: 2019-02-04T00:00:00.000000000 - Number of non-NaN NDVI values: 989976
Date: 2019-02-09T00:00:00.000000000 - Number of non-NaN NDVI values: 998637
Date: 2019-02-14T00:00:00.000000000 - Number of non-NaN NDVI values: 489968
Date: 2019-02-19T00:00:00.000000000 - Number of non-NaN NDVI values: 709547
Date: 2019-02-24T00:00:00.000000000 - Number of non-NaN NDVI values: 900770
Date: 2019-03-01T00:00:00.000000000 - Number of non-NaN NDVI values: 729155
Date: 2019-03-06T00:00:00.000000000 - Number of non-NaN NDVI values: 824081
Date: 2019-03-11T00:00:00.000000000 - Number of non-NaN NDVI values: 971608
Date: 2019-03-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1388846
Date: 2019-03-21T00:00:00.000000000 - Number of non-NaN NDVI values: 809416
Date: 2019-03-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1224888
Date: 2019-03-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1188069
Date: 2019-04-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1323358
Date: 2019-04-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1162629
Date: 2019-04-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1192496
Date: 2019-04-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1152134
Date: 2019-04-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1308515
Date: 2019-04-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1377644
Date: 2019-05-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1063192
Date: 2019-05-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1331953
Date: 2019-05-15T00:00:00.000000000 - Number of non-NaN NDVI values: 909097
Date: 2019-05-20T00:00:00.000000000 - Number of non-NaN NDVI values: 896371
Date: 2019-05-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1216011
Date: 2019-05-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1294520
Date: 2019-06-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1499912
Date: 2019-06-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1300057
Date: 2019-06-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1476098
Date: 2019-06-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1486866
Date: 2019-06-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1466269
Date: 2019-06-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1393771
Date: 2019-07-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1408219
Date: 2019-07-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1527057
Date: 2019-07-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1275057
Date: 2019-07-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1279632
Date: 2019-07-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1302231
Date: 2019-07-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1304670
Date: 2019-08-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1193378
Date: 2019-08-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1208343
Date: 2019-08-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1463950
Date: 2019-08-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1440332
Date: 2019-08-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1417718
Date: 2019-08-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1319719
Date: 2019-09-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1367711
Date: 2019-09-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1224311
Date: 2019-09-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1270193
Date: 2019-09-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1275820
Date: 2019-09-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1532260
Date: 2019-09-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1266551
Date: 2019-10-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1370441
Date: 2019-10-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1414227
Date: 2019-10-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1533796
Date: 2019-10-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1475055
Date: 2019-10-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1372160
Date: 2019-10-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1337412
Date: 2019-11-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1430690
Date: 2019-11-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1202601
Date: 2019-11-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1178391
Date: 2019-11-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1068666
Date: 2019-11-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1107350
Date: 2019-11-26T00:00:00.000000000 - Number of non-NaN NDVI values: 626652
Date: 2019-12-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1035307
Date: 2019-12-06T00:00:00.000000000 - Number of non-NaN NDVI values: 839038
Date: 2019-12-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1140745
Date: 2019-12-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1264021
Date: 2019-12-21T00:00:00.000000000 - Number of non-NaN NDVI values: 954351
Date: 2019-12-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1213662
Date: 2019-12-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1039856
Date: 2020-01-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1144339
Date: 2020-01-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1292090
Date: 2020-01-15T00:00:00.000000000 - Number of non-NaN NDVI values: 910265
Date: 2020-01-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1120426
Date: 2020-01-25T00:00:00.000000000 - Number of non-NaN NDVI values: 994905
Date: 2020-01-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1256509
Date: 2020-02-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1197279
Date: 2020-02-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1106574
Date: 2020-02-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1169250
Date: 2020-02-19T00:00:00.000000000 - Number of non-NaN NDVI values: 854661
Date: 2020-02-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1367902
Date: 2020-02-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1191210
Date: 2020-03-05T00:00:00.000000000 - Number of non-NaN NDVI values: 936542
Date: 2020-03-10T00:00:00.000000000 - Number of non-NaN NDVI values: 664485
Date: 2020-03-15T00:00:00.000000000 - Number of non-NaN NDVI values: 638823
Date: 2020-03-20T00:00:00.000000000 - Number of non-NaN NDVI values: 893348
Date: 2020-03-25T00:00:00.000000000 - Number of non-NaN NDVI values: 722596
Date: 2020-03-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1304162
Date: 2020-04-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1150964
Date: 2020-04-09T00:00:00.000000000 - Number of non-NaN NDVI values: 830741
Date: 2020-04-14T00:00:00.000000000 - Number of non-NaN NDVI values: 884877
Date: 2020-04-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1015297
Date: 2020-04-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1298293
Date: 2020-04-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1476600
Date: 2020-05-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1377043
Date: 2020-05-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1313016
Date: 2020-05-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1215302
Date: 2020-05-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1430985
Date: 2020-05-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1420325
Date: 2020-05-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1123239
Date: 2020-06-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1265310
Date: 2020-06-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1492647
Date: 2020-06-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1310120
Date: 2020-06-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1530476
Date: 2020-06-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1320043
Date: 2020-06-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1396954
Date: 2020-07-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1350658
Date: 2020-07-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1376557
Date: 2020-07-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1220621
Date: 2020-07-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1125643
Date: 2020-07-23T00:00:00.000000000 - Number of non-NaN NDVI values: 953489
Date: 2020-07-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1419714
Date: 2020-08-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1421451
Date: 2020-08-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1328570
Date: 2020-08-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1309194
Date: 2020-08-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1359476
Date: 2020-08-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1393558
Date: 2020-08-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1336536
Date: 2020-09-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1426648
Date: 2020-09-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1178498
Date: 2020-09-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1284034
Date: 2020-09-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1338585
Date: 2020-09-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1427975
Date: 2020-09-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1398051
Date: 2020-10-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1322598
Date: 2020-10-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1301989
Date: 2020-10-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1450704
Date: 2020-10-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1411401
Date: 2020-10-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1209509
Date: 2020-10-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1298224
Date: 2020-10-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1208211
Date: 2020-11-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1077463
Date: 2020-11-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1332641
Date: 2020-11-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1238394
Date: 2020-11-20T00:00:00.000000000 - Number of non-NaN NDVI values: 976203
Date: 2020-11-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1442804
Date: 2020-11-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1468960
Date: 2020-12-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1291332
Date: 2020-12-10T00:00:00.000000000 - Number of non-NaN NDVI values: 783388
Date: 2020-12-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1309828
Date: 2020-12-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1267438
Date: 2020-12-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1025053
Date: 2020-12-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1428678
Date: 2021-01-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1354112
Date: 2021-01-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1294689
Date: 2021-01-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1265808
Date: 2021-01-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1047103
Date: 2021-01-24T00:00:00.000000000 - Number of non-NaN NDVI values: 849325
Date: 2021-01-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1155481
Date: 2021-02-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1314973
Date: 2021-02-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1222713
Date: 2021-02-13T00:00:00.000000000 - Number of non-NaN NDVI values: 655904
Date: 2021-02-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1278593
Date: 2021-02-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1235474
Date: 2021-02-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1326062
Date: 2021-03-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1105634
Date: 2021-03-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1051033
Date: 2021-03-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1129931
Date: 2021-03-20T00:00:00.000000000 - Number of non-NaN NDVI values: 810931
Date: 2021-03-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1385991
Date: 2021-03-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1366627
Date: 2021-04-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1347199
Date: 2021-04-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1272275
Date: 2021-04-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1160426
Date: 2021-04-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1100122
Date: 2021-04-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1220997
Date: 2021-04-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1184735
Date: 2021-05-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1507479
Date: 2021-05-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1369440
Date: 2021-05-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1298532
Date: 2021-05-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1284081
Date: 2021-05-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1475327
Date: 2021-05-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1242700
Date: 2021-06-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1249449
Date: 2021-06-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1468316
Date: 2021-06-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1488158
Date: 2021-06-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1420875
Date: 2021-06-23T00:00:00.000000000 - Number of non-NaN NDVI values: 938676
Date: 2021-06-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1108215
Date: 2021-07-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1325758
Date: 2021-07-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1397584
Date: 2021-07-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1312019
Date: 2021-07-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1160595
Date: 2021-07-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1072330
Date: 2021-07-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1233585
Date: 2021-08-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1425644
Date: 2021-08-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1189717
Date: 2021-08-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1130541
Date: 2021-08-17T00:00:00.000000000 - Number of non-NaN NDVI values: 1244330
Date: 2021-08-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1429885
Date: 2021-08-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1400988
Date: 2021-09-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1235652
Date: 2021-09-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1490848
Date: 2021-09-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1309929
Date: 2021-09-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1493453
Date: 2021-09-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1352477
Date: 2021-09-26T00:00:00.000000000 - Number of non-NaN NDVI values: 850782
Date: 2021-10-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1189767
Date: 2021-10-06T00:00:00.000000000 - Number of non-NaN NDVI values: 946703
Date: 2021-10-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1308464
Date: 2021-10-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1324177
Date: 2021-10-21T00:00:00.000000000 - Number of non-NaN NDVI values: 840014
Date: 2021-10-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1342931
Date: 2021-10-31T00:00:00.000000000 - Number of non-NaN NDVI values: 1108334
Date: 2021-11-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1184649
Date: 2021-11-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1323340
Date: 2021-11-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1204549
Date: 2021-11-20T00:00:00.000000000 - Number of non-NaN NDVI values: 704829
Date: 2021-11-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1371307
Date: 2021-11-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1298220
Date: 2021-12-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1085287
Date: 2021-12-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1110481
Date: 2021-12-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1208832
Date: 2021-12-20T00:00:00.000000000 - Number of non-NaN NDVI values: 904836
Date: 2021-12-25T00:00:00.000000000 - Number of non-NaN NDVI values: 574242
Date: 2021-12-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1169220
Date: 2022-01-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1189770
Date: 2022-01-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1309437
Date: 2022-01-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1278740
Date: 2022-01-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1296185
Date: 2022-01-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1404603
Date: 2022-01-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1021273
Date: 2022-02-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1339798
Date: 2022-02-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1361600
Date: 2022-02-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1395969
Date: 2022-02-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1226100
Date: 2022-02-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1288531
Date: 2022-02-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1384529
Date: 2022-03-05T00:00:00.000000000 - Number of non-NaN NDVI values: 1035053
Date: 2022-03-10T00:00:00.000000000 - Number of non-NaN NDVI values: 1245686
Date: 2022-03-15T00:00:00.000000000 - Number of non-NaN NDVI values: 1231153
Date: 2022-03-20T00:00:00.000000000 - Number of non-NaN NDVI values: 1341335
Date: 2022-03-25T00:00:00.000000000 - Number of non-NaN NDVI values: 1199186
Date: 2022-03-30T00:00:00.000000000 - Number of non-NaN NDVI values: 1186419
Date: 2022-04-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1534061
Date: 2022-04-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1214160
Date: 2022-04-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1230675
Date: 2022-04-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1014514
Date: 2022-04-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1256100
Date: 2022-04-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1460838
Date: 2022-05-04T00:00:00.000000000 - Number of non-NaN NDVI values: 1197087
Date: 2022-05-09T00:00:00.000000000 - Number of non-NaN NDVI values: 1476932
Date: 2022-05-14T00:00:00.000000000 - Number of non-NaN NDVI values: 1490087
Date: 2022-05-19T00:00:00.000000000 - Number of non-NaN NDVI values: 1262806
Date: 2022-05-24T00:00:00.000000000 - Number of non-NaN NDVI values: 1448367
Date: 2022-05-29T00:00:00.000000000 - Number of non-NaN NDVI values: 1476842
Date: 2022-06-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1192456
Date: 2022-06-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1402978
Date: 2022-06-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1453343
Date: 2022-06-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1217817
Date: 2022-06-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1181398
Date: 2022-06-28T00:00:00.000000000 - Number of non-NaN NDVI values: 1409386
Date: 2022-07-03T00:00:00.000000000 - Number of non-NaN NDVI values: 1307337
Date: 2022-07-08T00:00:00.000000000 - Number of non-NaN NDVI values: 1471812
Date: 2022-07-13T00:00:00.000000000 - Number of non-NaN NDVI values: 1392438
Date: 2022-07-18T00:00:00.000000000 - Number of non-NaN NDVI values: 1367636
Date: 2022-07-23T00:00:00.000000000 - Number of non-NaN NDVI values: 1298967
Date: 2022-07-28T00:00:00.000000000 - Number of non-NaN NDVI values: 975686
Date: 2022-08-02T00:00:00.000000000 - Number of non-NaN NDVI values: 1227866
Date: 2022-08-07T00:00:00.000000000 - Number of non-NaN NDVI values: 1220408
Date: 2022-08-12T00:00:00.000000000 - Number of non-NaN NDVI values: 1253823
Date: 2022-08-17T00:00:00.000000000 - Number of non-NaN NDVI values: 971412
Date: 2022-08-22T00:00:00.000000000 - Number of non-NaN NDVI values: 1262454
Date: 2022-08-27T00:00:00.000000000 - Number of non-NaN NDVI values: 1391227
Date: 2022-09-01T00:00:00.000000000 - Number of non-NaN NDVI values: 1434055
Date: 2022-09-06T00:00:00.000000000 - Number of non-NaN NDVI values: 1347668
Date: 2022-09-11T00:00:00.000000000 - Number of non-NaN NDVI values: 1164327
Date: 2022-09-16T00:00:00.000000000 - Number of non-NaN NDVI values: 1358083
Date: 2022-09-21T00:00:00.000000000 - Number of non-NaN NDVI values: 1160230
Date: 2022-09-26T00:00:00.000000000 - Number of non-NaN NDVI values: 1369278
Date: 2022-10-01T00:00:00.000000000 - Number of non-NaN NDVI values: 908823
Date: 2022-10-06T00:00:00.000000000 - Number of non-NaN NDVI values: 954979
Additionally, we plot the date of the first non-NaN value of each pixel across all cubes. For most pixels, the first non-NaN value occurs on 2017-01-15, and the latest first occurrence is on 2017-05-05. This might motivate using 2017-05-05 as the starting date for our training time series.
# List to store the first non-NaN time indices for all pixels
first_non_nan_indices = []
# Loop over all NetCDF files
for nc_file in nc_files:
    nc_path = os.path.join(file_path, nc_file)
    # Load the current NetCDF file
    data = xr.open_dataset(nc_path)
    ndvi_data = data['NDVI'].values
    # Get the size of the data
    time_len, x_len, y_len = ndvi_data.shape
    # Find the first non-NaN time index for each pixel
    for x in range(x_len):
        for y in range(y_len):
            first_non_nan_time_index = np.nan
            for t in range(time_len):
                if not np.isnan(ndvi_data[t, x, y]):
                    first_non_nan_time_index = t
                    break
            first_non_nan_indices.append(first_non_nan_time_index)
# Convert time indices to date values
start_date = pd.Timestamp('2016-01-01')
dates = [start_date + pd.Timedelta(days=int(index) * 5) for index in first_non_nan_indices if not np.isnan(index)]
# Count occurrences of each date
date_counts = pd.Series(dates).value_counts().sort_index()
# Plot the distribution of first non-NaN dates as a bar plot
plt.figure(figsize=(12, 6))
plt.bar(date_counts.index, date_counts.values, edgecolor='black', alpha=0.7)
plt.title('Distribution of First Non-NaN NDVI Dates for All Pixels')
plt.xlabel('Date')
plt.ylabel('Count of Pixels')
plt.grid(True)
plt.xticks(rotation=45) # Rotate date labels for better readability
plt.tight_layout()
plt.show()
# Calculate the median of the date values
median_date_num = np.median([date.toordinal() for date in dates])
median_date = pd.Timestamp.fromordinal(int(median_date_num))
median_date
Timestamp('2017-01-15 00:00:00')
max(dates)
Timestamp('2017-05-05 00:00:00')
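The pixel-by-pixel search for the first valid time step can also be expressed with array operations, which tends to be much faster on full cubes. The following is a minimal sketch, not the notebook's own implementation: `first_valid_index` is a hypothetical helper name, the array layout `(time, x, y)` is assumed from the loop above, and the tiny cube is synthetic.

```python
import numpy as np

def first_valid_index(ndvi):
    """Return, per pixel, the first time index with a non-NaN value (-1 if none)."""
    valid = ~np.isnan(ndvi)           # shape (time, x, y), True where data exists
    first = valid.argmax(axis=0)      # position of the first True along the time axis
    first[~valid.any(axis=0)] = -1    # pixels that are NaN at every time step
    return first

# Tiny synthetic cube: 4 time steps, 1 x 3 pixels (not real data)
cube = np.full((4, 1, 3), np.nan)
cube[2:, 0, 0] = 0.5   # first valid value at t=2
cube[0, 0, 1] = 0.1    # first valid value at t=0
first_idx = first_valid_index(cube)
print(first_idx)
```

Pixels without any valid value are marked with -1 here instead of NaN so that the result stays an integer array; they would need to be filtered out before converting indices to dates.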
Analyzing the number of non-NaN NDVI values over time reveals recurring measurement dates (one every 5 days) on which fewer than 600,000 of the 128 x 128 x 100 = 1,638,400 possible pixel values (less than 40%) are valid. This observation highlights the poor data quality and suggests that the substantial amount of missing data will significantly impair the prediction performance of our time series models.
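As a rough illustration of this threshold, the share of valid pixel values per date can be computed directly from the counts. This is only a sketch: `sample_counts` is a hypothetical stand-in for the full `results` dictionary and holds three counts copied from the output above.

```python
# Total pixel values per measurement date: 100 cubes of 128 x 128 pixels
TOTAL_PIXELS = 128 * 128 * 100
THRESHOLD = 600_000  # cut-off used in the text

# Three counts copied from the printed output; hypothetical stand-in for `results`
sample_counts = {
    "2021-12-20": 904836,
    "2021-12-25": 574242,
    "2021-12-30": 1169220,
}

# Keep only the dates below the threshold, together with their valid fraction
low_coverage = {
    date: count / TOTAL_PIXELS
    for date, count in sample_counts.items()
    if count < THRESHOLD
}
for date, fraction in low_coverage.items():
    print(f"{date}: {fraction:.1%} valid")
```

Of these three dates, only 2021-12-25 falls below the threshold, with roughly 35% of the pixel values valid.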
However, we have chosen to start our analysis from July 2017, as the first period of missing data (< 600k valid pixel values) ends here, and the next three measurement dates contain more than 1,000,000 valid pixels. As depicted by the red LOESS trend line in the graph, the number of valid pixels reaches a higher and more consistent level starting from this point, except for some subsequent periods with a low number of valid pixel values, particularly at the end of 2018 and the beginning of 2019.
Thus, we set our training period from July 4, 2017, to June 28, 2021, and the test period from July 3, 2021, to October 6, 2022. This ensures we have three complete seasonal cycles (apart from the periods with few data) available for training our models.
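To sanity-check the chosen boundaries, the sketch below maps each date to its position on the 5-day grid and counts the time steps per split. It assumes the grid starts on 2016-01-01 with one step every 5 days, as used for the date conversion earlier; `time_index` is a hypothetical helper, not part of the project code.

```python
from datetime import date

STEP_DAYS = 5
GRID_START = date(2016, 1, 1)  # assumed start of the measurement period

def time_index(day):
    """Index of `day` on the 5-day grid; raises if it is not a grid date."""
    offset = (day - GRID_START).days
    if offset % STEP_DAYS:
        raise ValueError(f"{day} is not on the 5-day grid")
    return offset // STEP_DAYS

# Both boundaries are inclusive, hence the +1
train_steps = time_index(date(2021, 6, 28)) - time_index(date(2017, 7, 4)) + 1
test_steps = time_index(date(2022, 10, 6)) - time_index(date(2021, 7, 3)) + 1
print(train_steps, test_steps)
```

Under these assumptions, and provided no dates are missing from the time axis, the split yields 292 training and 93 test time steps per pixel.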
import matplotlib.dates as mdates
import statsmodels.api as sm
# Extract data for plotting
dates = list(results.keys())
non_nan_counts = list(results.values())
# Convert dates to a readable format if they are in string format
if isinstance(dates[0], str):
    dates = [np.datetime64(date) for date in dates]
# Convert dates to a numeric format for LOESS
numeric_dates = mdates.date2num(dates)
# Fit LOESS model
loess = sm.nonparametric.lowess(non_nan_counts, numeric_dates, frac=0.1)
# Plot the results as a histogram and a LOESS smoothed line
plt.figure(figsize=(10, 6))
plt.bar(dates, non_nan_counts, alpha=0.6, label='Non-NaN NDVI values')
plt.plot(dates, loess[:, 1], color='red', linewidth=2, linestyle='--', alpha=0.7, label='LOESS trend')
plt.xlabel('Time')
plt.ylabel('Number of non-NaN NDVI values')
plt.title('Number of non-NaN NDVI values over time')
plt.xticks(rotation=45)
plt.grid(True)
# Set x-axis to display quarterly ticks
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3)) # Set major ticks to every 3 months (quarterly)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m %Y')) # Format the ticks to show the month and year
# Set y-axis lower bound to 500,000
plt.ylim(bottom=500000)
plt.xticks(rotation=45) # Rotate date labels for better readability
plt.tight_layout()
plt.legend()
plt.show()
In this step, we will split the time series data into separate train and test datasets for each cube.
# Define output directories
output_folder_train = base_dir / "data" / "data_train"
output_folder_test = base_dir / "data" / "data_test"
# Create the output directories if they do not exist yet
os.makedirs(output_folder_train, exist_ok=True)
os.makedirs(output_folder_test, exist_ok=True)
# Perform Train-Test Split and Save in Separate Files
for idx, nc_file in enumerate(nc_files, start=1):
    full_path = os.path.join(file_path, nc_file)
    # Load the NetCDF file
    dataset = xr.open_dataset(full_path)
    # Train-Test Split
    train_data = dataset.sel(time=slice('2017-07-04', '2021-06-28'))
    test_data = dataset.sel(time=slice('2021-07-03', '2022-10-06'))
    # Generate the new filenames and full paths for the output files
    base_filename = os.path.splitext(os.path.basename(nc_file))[0]  # Remove the .nc extension
    train_filename = os.path.join(output_folder_train, f'{base_filename}_train.nc')
    test_filename = os.path.join(output_folder_test, f'{base_filename}_test.nc')
    # Save Train and Test Data
    train_data.to_netcdf(train_filename)
    test_data.to_netcdf(test_filename)
    print(f"[{idx}/{len(nc_files)}] Train and test datasets saved for {os.path.basename(nc_file)}")
Given the significant number of NaN values in our dataset and time series, we decided to implement two distinct approaches to handle these missing values.
Approach A: Handling NaNs with Outliers and Cloud Mask Integration
Approach B: Interpolation with STL Decomposition
STL Interpolation:
Why Use STL Interpolation:
# Define input and output dirs
file_path = base_dir / "data" / "data_train"
output_path = base_dir / "data" / "data_A_9999"
# Create the output directory if it does not exist yet
os.makedirs(output_path, exist_ok=True)
nc_files = [os.path.join(file_path, f) for f in os.listdir(file_path) if f.endswith('.nc')]
# Iterate over each .nc file and replace NaNs in the NDVI variable with -9999
for idx, nc_file in enumerate(nc_files, start=1):
    # Load the NetCDF file
    dataset = xr.open_dataset(nc_file)
    # Replace NaNs in the NDVI variable with -9999
    dataset['NDVI'] = dataset['NDVI'].fillna(-9999)
    # Generate the new filename and the full path for the output file
    output_file = os.path.join(output_path, f'ds_A_{os.path.basename(nc_file)}')
    # Save the modified dataset
    dataset.to_netcdf(output_file)
    print(f"[{idx}/{len(nc_files)}] Cube: {os.path.basename(nc_file)} | NaNs replaced and saved")
STL Interpolation
# Paths to the directory containing the NetCDF files
file_path = base_dir / "data" / "data_train"
output_path = base_dir / "data" / "data_B_interpolated"
os.makedirs(output_path, exist_ok=True)
# List all NetCDF files in the input directory
nc_files = [f for f in os.listdir(file_path) if f.endswith('.nc')]
# Iterate over each file
for nc_file in nc_files:
    full_path = os.path.join(file_path, nc_file)
    ds = xr.open_dataset(full_path)
    interpolated_data = []
    # Iterate over each pixel and apply STL interpolation
    for x in ds.x:
        for y in ds.y:
            ndvi_pixel = ds['NDVI'].sel(x=x, y=y)
            if ndvi_pixel.isnull().all():
                # print(f"Pixel (x={x}, y={y}) contains only NaNs and will be skipped.")
                interpolated_data.append(np.full(ndvi_pixel.shape, np.nan))  # Append NaN array
            else:
                interpolated_ndvi = stl_interpolate(ndvi_pixel)
                interpolated_data.append(interpolated_ndvi.values)
    # Reshape the interpolated data to match the original dimensions
    interpolated_ndvi_array = np.array(interpolated_data).reshape((ds.sizes['x'], ds.sizes['y'], ds.sizes['time'])).transpose(2, 0, 1)
    # Create a new DataArray with the interpolated NDVI values
    interpolated_da = xr.DataArray(interpolated_ndvi_array, coords=[ds.time, ds.x, ds.y], dims=['time', 'x', 'y'], name='NDVI')
    # Create a new Dataset with the interpolated NDVI values and original attributes
    new_ds = xr.Dataset({'NDVI': interpolated_da, 'Cloud_Mask': ds['Cloud_Mask']}, coords=ds.coords, attrs=ds.attrs)
    # Save the new dataset to a NetCDF file
    output_file = os.path.join(output_path, f'ds_B_{os.path.basename(nc_file)}')
    new_ds.to_netcdf(output_file)
    print(f"Interpolated file saved: {output_file}")
The data preparation process was extremely time-consuming and challenging. This notebook is the final result of various approaches, ideas, and steps. A significant portion of our work involved experimenting with different strategies which do not appear in this notebook as they did not work out. We encountered several key challenges:
Data Volume: Initially, we started with approximately 5000 cubes. However, executing some steps, such as the initial NDVI calculation based on Sentinel-2 data, was infeasible due to limited computing power, leading to frequent process failures. Consequently, we reduced the dataset to 100 cubes, selecting those with the fewest missing values. Despite this reduction, many steps, like interpolating missing values, were time-intensive and often failed due to memory constraints, prolonging our work.
Data Format: None of us had prior experience with DataCubes, requiring time to understand the data format and how it was stored and loaded (see src/data_processing/my_loader.py). The lack of proper documentation for the DataLoader and MiniCubes meant we had to rely heavily on trial and error to correctly load the data in the desired format.
Data Quality and Missing Data: The dataset contained many missing values, attributable to periods with missing Sentinel-2 band values and to pixels masked by the cloud mask. After deliberation, we decided to exclude the entire year 2016 due to its extensive missing values, even though the metadata indicates that the measurement period started on January 1, 2016. Further investigation revealed periodic intervals with few valid pixel values, which are detrimental to time series prediction and will definitely reduce our models’ predictive performance. To address missing values, we tested two approaches:
During the process of building the models, we encountered several problems.
Data for approach A
Procedure for the LSTM model:
After the model was created and the predictions were made, the data was denormalized. When looking at the results, we noticed that all predicted values were disproportionately high (> 0.99). Given the natural NDVI range of -1 to 1 and the values in the baseline data, we concluded that these results are implausible and must be incorrect. Unfortunately, due to the limited time and the long run times of the code, we were ultimately unable to determine the exact cause or find an appropriate solution. It is possible that the masking of the -9999 values did not work correctly, so that the model treated them as “normal” values. Incorrect normalization is another possible reason.
To keep the models comparable, we then decided to exclude this dataset from all models in our project.
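The sketch below illustrates one way the suspected failure mode could arise. It does not establish that this is what happened in our pipeline. If the -9999 sentinel takes part in a naive min-max scaling, it sets the minimum and pushes every valid value close to 1. `minmax_normalize_masked` is a hypothetical helper, and the series is synthetic.

```python
import numpy as np

FILL_VALUE = -9999.0  # sentinel used for missing NDVI in approach A

def minmax_normalize_masked(values, fill_value=FILL_VALUE):
    """Min-max scale valid entries to [0, 1]; leave sentinel entries unchanged.

    Assumes at least two distinct valid values.
    """
    values = np.asarray(values, dtype=float)
    valid = values != fill_value
    lo, hi = values[valid].min(), values[valid].max()
    scaled = values.copy()
    scaled[valid] = (values[valid] - lo) / (hi - lo)
    return scaled, valid

# Synthetic series (not real data): three valid NDVI values and one gap
series = np.array([0.2, FILL_VALUE, 0.6, 0.4])

# Naive scaling lets the sentinel define the minimum
naive = (series - series.min()) / (series.max() - series.min())
masked, valid = minmax_normalize_masked(series)
print(naive[valid])   # all valid values end up above 0.99
print(masked[valid])  # valid values spread over [0, 1]
```

Computing the scaling statistics on valid values only, and passing the boolean mask on to the model or loss, keeps the sentinel from distorting the value range.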
Reducing Cubes (complete cubes): Furthermore, our initial aim was to use all 100 cubes for our project. This quickly proved impossible with the memory and computing power available to us. Even with a GPU, only very few cubes could be processed. We therefore had to reduce our data basis to 4 cubes. We sorted the 100 cubes in descending order of completeness, where complete means that the cube contains as few pixels as possible that contain NaN values across all timesteps. The 4 best cubes are therefore the following: